System and method for semantically exploring concepts

ABSTRACT

A method for detecting and categorizing topics in a plurality of interactions includes: extracting, by a processor, a plurality of fragments from the plurality of interactions; filtering, by the processor, the plurality of fragments to generate a filtered plurality of fragments; clustering, by the processor, the filtered fragments into a plurality of base clusters; and clustering, by the processor, the plurality of base clusters into a plurality of hyper clusters.

FIELD

Aspects of the present invention relate to performing analytics on communications. In particular, aspects of the present invention relate to analyzing recorded and live information to categorize conversations and to identify concepts and trends.

BACKGROUND

An organization's contact center typically receives a multitude of communications or interactions (e.g., calls, text chat messages, email messages, social media messages, etc.) regarding a variety of issues. For example, a sales department of a contact center may take part in interactions involving questions about the feature sets and pricing of various products offered by the organization; a customer support department may interact with customers to discuss particular problems with using the products or the quality of the services being delivered; an accounts department may field interactions about changes in billing policy, incorrect charges, and other issues.

Generally, it is useful for an organization to be able to identify concepts and patterns within the conversations (or “interactions”) in order to categorize the calls and identify underlying issues to be addressed (e.g., specific complaints about products or general dissatisfaction with services). However, conventional systems for doing so generally involve the manual survey of data collected by customer support agents and manual analysis of this data. This manual process of analysis can be time consuming and there may be long delays between collecting the data and determining results from the analysis.

In some conventional systems, conversations can be tagged or categorized based on their containing predefined keywords or phrases. For example, through the above discussed manual (human) analysis of phrases that are either identified by a human listener or identified by a computer system using phrase recognition, one might infer that conversations with a call center that contain the phrases “I would like to speak to your manager” and “Can I talk to your supervisor?” lead to the escalation of the call to a higher level representative. As such, any call containing these phrases would be categorized as containing an “escalation attempt.”

As such, an organization can identify trends and infer conditions based on the number of such interactions falling into various categories. For example, a large number of interactions originating from a particular area and categorized as indicating a “service outage” or “poor network performance” could alert an internet service provider to take action to address system problems within that particular area.

However, conversations containing phrases that were not previously identified would not be categorized appropriately. For example, if the phrase “Let me talk to your boss” was not previously identified as being associated with escalation attempts, then a conversation containing that phrase would not be categorized as an “escalation attempt.”
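
By way of illustration only, the limitation described above may be sketched as follows. The sketch assumes a simple case-insensitive substring match against a fixed phrase list; the names ESCALATION_PHRASES and categorize are hypothetical and are not taken from any particular conventional system.

```python
from typing import List

# Hypothetical predefined phrase list for one category (an assumption for illustration).
ESCALATION_PHRASES = (
    "i would like to speak to your manager",
    "can i talk to your supervisor",
)


def categorize(transcript: str) -> List[str]:
    """Return the categories whose predefined phrases occur in the transcript."""
    text = transcript.lower()
    categories = []
    if any(phrase in text for phrase in ESCALATION_PHRASES):
        categories.append("escalation attempt")
    return categories


# A predefined phrase is matched, so the conversation is categorized.
tagged = categorize("Can I talk to your supervisor? This is the third time.")

# A phrase with the same intent that was never predefined is missed.
missed = categorize("Let me talk to your boss.")
```

In this sketch the second conversation receives no category even though it expresses the same intent, because the matching depends entirely on the predefined list.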

In addition, some conventional systems use Bayesian networks to identify correlations between events. However, developing these Bayesian networks requires significant human input to specify various parameters (e.g., the nodes of the Bayesian network).

SUMMARY

Aspects of embodiments of the present invention are directed to systems and methods for the discovery and exploration of topics and categories within a set of data. One aspect of the present invention is directed to the automatic discovery and extraction, without human assistance, of topics or concepts having similar meaning and semantics from a set of documents. The discovered and extracted topics can further be clustered into “parent categories,” where each parent category contains one or more “base topics,” thereby creating a hierarchical taxonomy.

The structure of parent categories that contain topics (or sub-topics) can be used to generate a global taxonomy of all the semantic issues and concepts identified in the domain. Such a global taxonomy can then be visualized for easy navigation by users and further analysis of current trends and issues.

According to one embodiment of the present invention, a method for detecting and categorizing topics in a plurality of interactions includes: extracting, by a processor, a plurality of fragments from the plurality of interactions; filtering, by the processor, the plurality of fragments to generate a filtered plurality of fragments; clustering, by the processor, the filtered fragments into a plurality of base clusters; and clustering, by the processor, the plurality of base clusters into a plurality of hyper clusters.

The extracting the plurality of fragments from the plurality of interactions may include: receiving, by the processor, text corresponding to the plurality of interactions; tagging, by the processor, portions of the text based on parts of speech; and extracting, by the processor, fragments from the text in accordance with one or more extraction rules.

The text corresponding to the plurality of interactions may include an output of an automatic speech recognition engine, the output being generated by processing at least one of the plurality of interactions through the automatic speech recognition engine.

The one or more extraction rules may include a part of speech sequence.
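
As a non-limiting sketch of how an extraction rule expressed as a part of speech sequence might be applied, the following example scans text that has already been tagged and collects every run of words whose tags match a rule. The tag names (Penn Treebank style), the sample rule, and the function name extract_fragments are assumptions made for illustration; the tagging step itself is assumed to have been performed elsewhere.

```python
from typing import List, Sequence, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); tags here are assumed, Penn Treebank style


def extract_fragments(tagged: Sequence[Token],
                      rules: Sequence[Sequence[str]]) -> List[str]:
    """Collect every contiguous word run whose tag sequence equals one of the rules."""
    tags = [tag for _, tag in tagged]
    fragments = []
    for rule in rules:
        width = len(rule)
        for start in range(len(tagged) - width + 1):
            if tags[start:start + width] == list(rule):
                words = [word for word, _ in tagged[start:start + width]]
                fragments.append(" ".join(words))
    return fragments


# Hand-tagged example text (an assumption; a real system would use a tagger).
tagged_text = [
    ("i", "PRP"), ("want", "VBP"), ("to", "TO"),
    ("cancel", "VB"), ("my", "PRP$"), ("account", "NN"),
]

# One hypothetical rule: verb, possessive pronoun, noun.
rules = [("VB", "PRP$", "NN")]

fragments = extract_fragments(tagged_text, rules)
```

Here the single rule yields the fragment "cancel my account" from the tagged text.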

The one or more extraction rules may be automatically generated by the processor based on a plurality of manually extracted fragments.

The method may further include labeling, by the processor, a base cluster of the plurality of base clusters, the labeling including: extracting, by the processor, a plurality of noun phrases from the base cluster; computing, by the processor, a distribution of probabilities of stems of the noun phrases; and identifying, by the processor, a label noun phrase of the noun phrases, the label noun phrase having a highest probability based on the stem distribution.
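
The labeling described above may be sketched, under stated assumptions, as follows. The noun phrases are assumed to have been extracted already; crude_stem is a hypothetical stand-in for a real stemmer; and scoring a phrase by the mean probability of its stems is one assumed reading of "highest probability based on the stem distribution," not a scoring function specified herein.

```python
from collections import Counter
from typing import List


def crude_stem(word: str) -> str:
    """Very rough suffix stripping, used only as a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def label_cluster(noun_phrases: List[str]) -> str:
    """Pick the noun phrase whose stems are most probable within the cluster.

    Assumes a non-empty list of non-empty phrases.
    """
    stems_per_phrase = [[crude_stem(w) for w in p.lower().split()]
                        for p in noun_phrases]
    counts = Counter(s for stems in stems_per_phrase for s in stems)
    total = sum(counts.values())
    probability = {stem: n / total for stem, n in counts.items()}

    def score(stems: List[str]) -> float:
        # Assumed scoring: mean stem probability, so long phrases are not favored.
        return sum(probability[s] for s in stems) / len(stems)

    best = max(range(len(noun_phrases)),
               key=lambda i: score(stems_per_phrase[i]))
    return noun_phrases[best]


phrases = ["billing error", "billing errors", "late fee", "billing statement"]
label = label_cluster(phrases)
```

With these four phrases, the stems "bill" and "error" dominate the distribution, so the sketch selects "billing error" (the first of the two tied phrases) as the label.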

The clustering the plurality of base clusters into the plurality of hyper clusters may include: computing, by the processor, a plurality of semantic distances between pairs of the plurality of base clusters; and clustering, by the processor, the base clusters into the hyper clusters in accordance with the semantic distances.

The plurality of semantic distances may be computed based on semantic similarities of the pairs of base clusters and co-occurrence of fragments in the pairs of base clusters.
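
One possible way to combine the two signals into a distance and group base clusters accordingly is sketched below. Every specific choice is an assumption made for illustration: each base cluster is represented by a non-negative feature vector and by the set of interactions in which its fragments occur, semantic similarity is taken as cosine similarity, co-occurrence is taken as Jaccard overlap of the interaction sets, the two are mixed with a weight alpha, and base clusters closer than a threshold are merged by single linkage.

```python
import math
from itertools import combinations
from typing import Dict, List, Set


def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def jaccard(a: Set[int], b: Set[int]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def semantic_distance(c1: dict, c2: dict, alpha: float = 0.5) -> float:
    """Assumed combination: 1 - (alpha * similarity + (1 - alpha) * co-occurrence)."""
    similarity = cosine(c1["vector"], c2["vector"])
    cooccurrence = jaccard(c1["interactions"], c2["interactions"])
    return 1.0 - (alpha * similarity + (1.0 - alpha) * cooccurrence)


def hyper_cluster(base: Dict[str, dict], threshold: float = 0.6) -> List[Set[str]]:
    """Single-linkage grouping of base clusters whose distance is within threshold."""
    parent = {name: name for name in base}

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(base, 2):
        if semantic_distance(base[a], base[b]) <= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for name in base:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=sorted)


# Hypothetical base clusters: vectors and interaction identifiers are made up.
base_clusters = {
    "billing error": {"vector": [1.0, 0.0], "interactions": {1, 2, 3}},
    "incorrect charge": {"vector": [0.9, 0.1], "interactions": {2, 3, 4}},
    "bluetooth pairing": {"vector": [0.0, 1.0], "interactions": {7, 8}},
}
hyper_clusters = hyper_cluster(base_clusters)
```

In this sketch the two billing-related base clusters are both semantically similar and frequently co-occur, so they fall into one hyper cluster, while the Bluetooth base cluster remains on its own.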

The method may further include: generating, by the processor, a visualization of the plurality of topics as organized into a hierarchy based on the plurality of hyper clusters, at least one of the hyper clusters including a plurality of corresponding base clusters, each of the base clusters including a corresponding plurality of fragments.

According to one embodiment of the present invention, a system includes: a processor; and a memory, wherein the memory has stored thereon instructions that, when executed by the processor, cause the processor to: receive a plurality of interactions; extract a plurality of fragments from the plurality of interactions; filter the plurality of fragments to generate a filtered plurality of fragments; cluster the filtered fragments into a plurality of base clusters; and cluster the plurality of base clusters into a plurality of hyper clusters.

The instructions that cause the processor to extract the plurality of fragments from the plurality of interactions may include instructions that, when executed by the processor, cause the processor to: receive text corresponding to the plurality of interactions; tag portions of the text based on parts of speech; and extract fragments from the text in accordance with one or more extraction rules.

The text corresponding to the plurality of interactions may include an output of an automatic speech recognition engine, the output being generated by processing at least one of the plurality of interactions through the automatic speech recognition engine.

The one or more extraction rules may include a part of speech sequence.

The memory may further have stored thereon instructions that, when executed by the processor, cause the processor to generate the one or more extraction rules based on a plurality of manually extracted fragments.

The memory may further have stored thereon instructions that, when executed by the processor, cause the processor to label a base cluster of the plurality of base clusters by: extracting a plurality of noun phrases from the base cluster; computing a distribution of probabilities of stems of the noun phrases; and identifying a label noun phrase of the noun phrases, the label noun phrase having a highest probability based on the stem distribution.

The instructions that cause the processor to cluster the plurality of base clusters into the plurality of hyper clusters may include instructions that, when executed by the processor, cause the processor to: compute a plurality of semantic distances between pairs of the plurality of base clusters; and cluster the base clusters into the hyper clusters in accordance with the semantic distances.

The instructions that cause the processor to compute the plurality of semantic distances between the pairs of the base clusters may include instructions to compute a semantic distance of the semantic distances based on semantic similarities between the pairs of the base clusters and co-occurrence of fragments in the pairs of the base clusters.

The memory may further have stored thereon instructions that, when executed by the processor, cause the processor to generate a visualization of a plurality of topics as organized into a hierarchy based on the plurality of hyper clusters, at least one of the hyper clusters including a plurality of corresponding base clusters, each of the base clusters including a corresponding plurality of fragments.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, together with the specification, illustrate exemplary embodiments of the present invention, and, together with the description, serve to explain the principles of the present invention.

FIG. 1 is a schematic block diagram of a system supporting a contact center that is configured to provide access to searchable transcripts to customer service agents according to one exemplary embodiment of the invention.

FIG. 2A is a block diagram of a computing device according to an embodiment of the present invention.

FIG. 2B is a block diagram of a computing device according to an embodiment of the present invention.

FIG. 2C is a block diagram of a computing device according to an embodiment of the present invention.

FIG. 2D is a block diagram of a computing device according to an embodiment of the present invention.

FIG. 2E is a block diagram of a network environment including several computing devices according to an embodiment of the present invention.

FIG. 3 is a screenshot of a category distribution report according to one embodiment of the present invention.

FIG. 4 is a screenshot illustrating an interface for customizing and defining predefined categories according to one embodiment of the present invention.

FIG. 5 is a screenshot illustrating an interface for exploring relationships between topics in a plurality of interactions according to one embodiment of the present invention.

FIG. 6A is a screenshot illustrating an interface depicting the relationship of identified topics in a large set of documents as a taxonomy (e.g., “global taxonomy”) according to one embodiment of the present invention.

FIGS. 6B and 6C are screenshots illustrating an interface depicting clusters within the larger taxonomy according to one embodiment of the present invention.

FIG. 7 is a flowchart illustrating a process for identifying topics and generating a taxonomy according to one embodiment of the present invention.

FIG. 8A is a flowchart illustrating a method for extracting fragments from within interactions according to one embodiment of the present invention.

FIG. 8B is a block diagram illustrating a system for extracting fragments from within interactions according to one embodiment of the present invention.

FIG. 8C is a flowchart illustrating a method for automatically generating extraction rules according to one embodiment of the present invention.

FIG. 9A is a flowchart illustrating a method for automatically generating hyper clusters of fragments according to one embodiment of the present invention.

FIG. 9B is a flowchart illustrating a method for automatically generating base clusters of fragments according to one embodiment of the present invention.

FIG. 9C is a flowchart illustrating a method for automatically generating labels for base clusters of fragments according to one embodiment of the present invention.

FIG. 10A is a flowchart illustrating a method for hyper clustering base clusters according to one embodiment of the present invention.

FIG. 10B is a flowchart illustrating a method for calculating a semantic distance between two base clusters according to one embodiment of the present invention.

FIG. 10C is a block diagram illustrating a system for hyper clustering base clusters according to one embodiment of the present invention.

DETAILED DESCRIPTION

In the following detailed description, only certain exemplary embodiments of the present invention are shown and described, by way of illustration. As those skilled in the art would recognize, the invention may be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein. Like reference numerals designate like elements throughout the specification.

As described herein, various applications and aspects of the present invention may be implemented in software, firmware, hardware, and combinations thereof. When implemented in software, the software may operate on a general purpose computing device such as a server, a desktop computer, a tablet computer, a smartphone, or a personal digital assistant. Such a general purpose computer includes a general purpose processor and memory.

Some embodiments of the present invention will be described in the context of a contact center. However, embodiments of the present invention are not limited thereto and may also be used under other conditions involving searching recorded audio such as in computer based education systems, voice messaging systems, medical transcripts, or any speech corpora from any source.

Aspects of embodiments of the present invention are directed to a system and method for automatically inferring and deducing topics of discussion (or “concepts”) from a body of recorded or live interactions (or conversations). These interactions may include, for example, telephone conversations, text-based chat sessions, email conversation threads, and the like. The inferring of these concepts does not require manual categorization by a human and can be performed by the system (or the “analytics system”) according to embodiments of the present invention. Therefore, new, previously unidentified topics of conversation can quickly be identified and brought to the attention of an organization without performing a manual analysis of conversation logs.

For example, suppose a company released a new version of a product that added Bluetooth® connectivity and there were no predefined categories in the interactions analytics system to match the phrases “Bluetooth connection” or “Bluetooth pairing” to issues with Bluetooth® connections. In conventional systems, this category might go undetected until those phrases were manually added to the analytics system.

In contrast, embodiments of the present invention are directed to a system and method for identifying salient phrases, generating new categories (or “concepts” or “topics”) based on these identified phrases, and categorizing interactions based on these automatically identified categories. As a result, embodiments of the present invention can be used to alert organizations to newly trending topics within interactions (e.g., conversations with customers), thereby allowing faster responses to changing circumstances. See, e.g., FIG. 3, which is a screenshot of a portion of a category distribution report 1 showing exemplary categories “New Customer,” “Emergency,” “Identification,” “Billing,” and “Payment Inquiry” along with the number of interactions categorized into each of these categories and the percentages of all calls that involve these categories. Note that the percentages add up to more than 100% because any given interaction may be assigned to multiple categories or not assigned to any category. Viewing this category distribution report, an organization can assess the most frequently discussed topics.
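
The arithmetic behind such a report may be illustrated with a minimal sketch. The interaction data below is invented for illustration, and the function name category_distribution is hypothetical; each percentage is computed against the total number of interactions, which is why the percentages can sum to more than 100% when interactions belong to several categories.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def category_distribution(assignments: List[Set[str]]) -> Dict[str, Tuple[int, float]]:
    """Map each category to (interaction count, percentage of all interactions)."""
    total = len(assignments)
    counts = Counter(category for categories in assignments for category in categories)
    if total == 0:
        return {}
    return {category: (n, 100.0 * n / total) for category, n in counts.items()}


# Four hypothetical interactions; one has two categories and one has none.
interactions = [
    {"Billing", "Payment Inquiry"},
    {"Billing"},
    {"New Customer", "Identification"},
    set(),
]

report = category_distribution(interactions)
total_percentage = sum(percentage for _, percentage in report.values())
```

In this example "Billing" covers two of the four interactions (50%), the other three categories cover one each (25%), and the percentages total 125% even though one interaction is uncategorized.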

In addition, embodiments of the present invention are directed to a system and method for generating a taxonomy of topics and displaying the taxonomy in a way that allows for easier analysis of current patterns in interactions. See, e.g., FIG. 6A, in which a global taxonomy of topics is shown as a collection of nested and clustered circles.

Therefore, embodiments of the present invention are directed to systems and methods for providing a timely summary of trends in topics of discussion in a collection of interactions and systems and methods for generating and displaying a taxonomy of topics.

In one embodiment, the above-described systems and methods are used in the context of a contact center and are used to monitor and infer topics of conversation during interactions between customers and an organization, as the topics may be organized into a taxonomy.

FIG. 1 is a schematic block diagram of a system supporting a contact center that is configured to provide customer availability information to customer service agents according to one exemplary embodiment of the invention. The contact center may be an in-house facility to a business or corporation for serving the enterprise in performing the functions of sales and service relative to the products and services available through the enterprise. In another aspect, the contact center may be a third-party service provider. The contact center may be hosted in equipment dedicated to the enterprise or third-party service provider, and/or hosted in a remote computing environment such as, for example, a private or public cloud environment with infrastructure for supporting multiple contact centers for multiple enterprises.

According to one exemplary embodiment, the contact center includes resources (e.g. personnel, computers, and telecommunication equipment) to enable delivery of services via telephone or other communication mechanisms. Such services may vary depending on the type of contact center, and may range from customer service to help desk, emergency response, telemarketing, order taking, and the like.

Customers, potential customers, or other end users (collectively referred to as customers) desiring to receive services from the contact center may initiate inbound calls to the contact center via their end user devices 10 a-10 c (collectively referenced as 10). Each of the end user devices 10 may be a communication device conventional in the art, such as, for example, a telephone, wireless phone, smart phone, personal computer, electronic tablet, and/or the like. Users operating the end user devices 10 may initiate, manage, and respond to telephone calls, emails, chats, text messaging, web-browsing sessions, and other multi-media transactions.

Inbound and outbound calls from and to the end user devices 10 may traverse a telephone, cellular, and/or data communication network 14 depending on the type of device that is being used. For example, the communications network 14 may include a private or public switched telephone network (PSTN), local area network (LAN), private wide area network (WAN), and/or public wide area network such as, for example, the Internet. The communications network 14 may also include a wireless carrier network including a code division multiple access (CDMA) network, global system for mobile communications (GSM) network, and/or any 3G or 4G network conventional in the art.

According to one exemplary embodiment, the contact center includes a switch/media gateway 12 coupled to the communications network 14 for receiving and transmitting calls between end users and the contact center. The switch/media gateway 12 may include a telephony switch configured to function as a central switch for agent level routing within the center. In this regard, the switch 12 may include an automatic call distributor, a private branch exchange (PBX), an IP-based software switch, and/or any other switch configured to receive Internet-sourced calls and/or telephone network-sourced calls. According to one exemplary embodiment of the invention, the switch is coupled to a call server 18 which may, for example, serve as an adapter or interface between the switch and the remainder of the routing, monitoring, and other call-handling systems of the contact center.

The contact center may also include a multimedia/social media server 24 for engaging in media interactions other than voice interactions with the end user devices 10 and/or web servers 32. The media interactions may be related, for example, to email, vmail (voice mail through email), chat, video, text-messaging, web, social media, screen-sharing, and the like. The web servers 32 may include, for example, social interaction site hosts for a variety of known social interaction sites to which an end user may subscribe, such as, for example, Facebook, Twitter, and the like. The web servers may also provide web pages for the enterprise that is being supported by the contact center. End users may browse the web pages and get information about the enterprise's products and services. The web pages may also provide a mechanism for contacting the contact center, via, for example, web chat, voice call, email, web real time communication (WebRTC), or the like.

According to one exemplary embodiment of the invention, the switch is coupled to an interactive voice response (IVR) server 34. The IVR server 34 is configured, for example, with an IVR script for querying customers on their needs. For example, a contact center for a bank may tell callers, via the IVR script, to “press 1” if they wish to get an account balance. If this is the case, through continued interaction with the IVR, customers may complete service without needing to speak with an agent.

If the call is to be routed to an agent, the call is forwarded to the call server 18 which interacts with a routing server 20 for finding an appropriate agent for processing the call. The call server 18 may be configured to process PSTN calls, VoIP calls, and the like. For example, the call server 18 may include a session initiation protocol (SIP) server for processing SIP calls.

In one example, while an agent is being located and until such agent becomes available, the call server may place the call in, for example, a call queue. The call queue may be implemented via any data structure conventional in the art, such as, for example, a linked list, array, and/or the like. The data structure may be maintained, for example, in buffer memory provided by the call server 18.
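
A call queue of this kind may be sketched, for example, with a double-ended queue held in memory. The class name CallQueue, its method names, and the first-in, first-out ordering are assumptions made for illustration; no particular ordering or data structure is required.

```python
from collections import deque
from typing import Deque, Optional


class CallQueue:
    """Holds calls that are waiting for an agent (first-in, first-out is assumed)."""

    def __init__(self) -> None:
        self._waiting: Deque[str] = deque()

    def place(self, call_id: str) -> None:
        """Place a call in the queue while an agent is being located."""
        self._waiting.append(call_id)

    def remove_next(self) -> Optional[str]:
        """Remove and return the longest-waiting call, or None if the queue is empty."""
        return self._waiting.popleft() if self._waiting else None

    def __len__(self) -> int:
        return len(self._waiting)


queue = CallQueue()
queue.place("call-1")
queue.place("call-2")

# An agent becomes available, so the longest-waiting call leaves the queue.
first = queue.remove_next()
```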

Once an appropriate agent is available to handle a call, the call is removed from the call queue and transferred to a corresponding agent device 38 a-38 c (collectively referenced as 38). Collected information about the caller and/or the caller's historical information may also be provided to the agent device for aiding the agent in better servicing the call. In this regard, each agent device 38 may include a telephone adapted for regular telephone calls, VoIP calls, and the like. The agent device 38 may also include a computer for communicating with one or more servers of the contact center and performing data processing associated with contact center operations, and for interfacing with customers via a variety of communication mechanisms such as chat, instant messaging, voice calls, and the like.

The selection of an appropriate agent for routing an inbound call may be based, for example, on a routing strategy employed by the routing server 20, and further based on information about agent availability, skills, and other routing parameters provided, for example, by a statistics server 22. According to one exemplary embodiment of the invention, the statistics server 22 includes a customer availability aggregation (CAA) module 36 for monitoring availability of end users on different communication channels and providing such information to, for example, the routing server 20, agent devices 38 a-38 c, and/or other contact center applications and devices. The CAA module may also be deployed in a separate application server. The aggregation module 36 may be a software module implemented via computer program instructions which are stored in memory of the statistics server 22 (or some other server), and which program instructions are executed by a processor. A person of skill in the art should recognize that the aggregation module 36 may also be implemented via firmware (e.g. an application-specific integrated circuit), hardware, or a combination of software, firmware, and hardware.

According to one exemplary embodiment, the aggregation module 36 is configured to receive customer availability information from other devices in the contact center, such as, for example, the multimedia/social media server 24. For example, the multimedia/social media server 24 may be configured to detect user presence on different websites including social media sites, and provide such information to the aggregation module 36. The multimedia/social media server 24 may also be configured to monitor and track interactions on those websites.

The multimedia/social media server 24 may also be configured to provide, to an end user, a mobile application 40 for downloading onto the end user device 10. The mobile application 40 may provide user configurable settings that indicate, for example, whether the user is available, not available, or availability is unknown, for purposes of being contacted by a contact center agent. The multimedia/social media server 24 may monitor the status settings and send updates to the aggregation module each time the status information changes.

The contact center may also include a reporting server 28 configured to generate reports from data aggregated by the statistics server 22. Such reports may include near real-time reports or historical reports concerning the state of resources, such as, for example, average waiting time, abandonment rate, agent occupancy, and the like. The reports may be generated automatically or in response to specific requests from a requestor (e.g. agent/administrator, contact center application, and/or the like).

According to one exemplary embodiment of the invention, the routing server 20 is enhanced with functionality for managing back-office/offline activities that are assigned to the agents. Such activities may include, for example, responding to emails, responding to letters, attending training seminars, or any other activity that does not entail real time communication with a customer. Once assigned to an agent, an activity may be pushed to the agent, or may appear in the agent's workbin 26 a-26 c (collectively referenced as 26) as a task to be completed by the agent. The agent's workbin may be implemented via any data structure conventional in the art, such as, for example, a linked list, array, and/or the like. The workbin may be maintained, for example, in buffer memory of each agent device 38.

According to one exemplary embodiment of the invention, the contact center also includes one or more mass storage devices 30 for storing different databases relating to agent data (e.g. agent profiles, schedules, etc.), customer data (e.g. customer profiles), interaction data (e.g. details of each interaction with a customer, including reason for the interaction, disposition data, time on hold, handle time, etc.), and the like. According to one embodiment, some of the data (e.g. customer profile data) may be provided by a third party database such as, for example, a third party customer relations management (CRM) database. The mass storage device may take the form of a hard disk or disk array as is conventional in the art.

According to one embodiment of the present invention, the contact center 102 also includes a call recording server 40 for recording the audio of calls conducted through the contact center 102, a call recording storage server 42 for storing the recorded audio, a speech analytics server 44 configured to process and analyze audio collected from the contact center 102, and a speech index database 46 for providing an index of the analyzed audio.

The speech analytics server 44 may be coupled to (or may include) ananalytics server 45 including a topic detecting module 45 a, a rootcause mining module 45 b, and a user interface module 45 d. Theanalytics server 45 may be configured to provide the automatic detectionof topics from interactions recorded by the call recording server 40 andstored on the call recording storage server 42. The analytics server 45may also access data stored on, for example, the multimedia/social mediaserver 24 in order to process interactions from various chat, socialmedia, email, and other non-voice interactions.

The various servers of FIG. 1 may each include one or more processorsexecuting computer program instructions and interacting with othersystem components for performing the various functionalities describedherein. The computer program instructions are stored in a memoryimplemented using a standard memory device, such as, for example, arandom access memory (RAM). The computer program instructions may alsobe stored in other non-transitory computer readable media such as, forexample, a CD-ROM, flash drive, or the like. Also, although thefunctionality of each of the servers is described as being provided bythe particular server, a person of skill in the art should recognizethat the functionality of various servers may be combined or integratedinto a single server, or the functionality of a particular server may bedistributed across one or more other servers without departing from thescope of the embodiments of the present invention.

Each of the various servers in the contact center may be a process orthread, running on one or more processors, in one or more computingdevices 500 (e.g., FIG. 2A, FIG. 2B), executing computer programinstructions and interacting with other system components for performingthe various functionalities described herein. The computer programinstructions are stored in a memory which may be implemented in acomputing device using a standard memory device, such as, for example, arandom access memory (RAM). The computer program instructions may alsobe stored in other non-transitory computer readable media such as, forexample, a CD-ROM, flash drive, or the like. Also, a person of skill inthe art should recognize that a computing device may be implemented viafirmware (e.g. an application-specific integrated circuit), hardware, ora combination of software, firmware, and hardware. A person of skill inthe art should also recognize that the functionality of variouscomputing devices may be combined or integrated into a single computingdevice, or the functionality of a particular computing device may bedistributed across one or more other computing devices without departingfrom the scope of the exemplary embodiments of the present invention. Aserver may be a software module, which may also simply be referred to asa module. The set of modules in the contact center may include serversand other modules.

FIG. 2A and FIG. 2B depict block diagrams of a computing device 500 asmay be employed in exemplary embodiments of the present invention. Asshown in FIG. 2A and FIG. 2B, each computing device 500 includes acentral processing unit 521, and a main memory unit 522. As shown inFIG. 2A, a computing device 500 may include a storage device 528, aremovable media interface 516, a network interface 518, an input/output(I/O) controller 523, one or more display devices 530 c, a keyboard 530a and a pointing device 530 b, such as a mouse. The storage device 528may include, without limitation, storage for an operating system andsoftware. As shown in FIG. 2B, each computing device 500 may alsoinclude additional optional elements, such as a memory port 503, abridge 570, one or more additional input/output devices 530 d, 530 e anda cache memory 540 in communication with the central processing unit521. Input/output devices, e.g., 530 a, 530 b, 530 d, and 530 e, may bereferred to herein using reference numeral 530.

The central processing unit 521 is any logic circuitry that responds to and processes instructions fetched from the main memory unit 522. It may be implemented, for example, in an integrated circuit, in the form of a microprocessor, microcontroller, or graphics processing unit (GPU), or in a field-programmable gate array (FPGA) or application-specific integrated circuit (ASIC). Main memory unit 522 may be one or more memory chips capable of storing data and allowing any storage location to be directly accessed by the central processing unit 521. In the embodiment shown in FIG. 2A, the central processing unit 521 communicates with main memory 522 via a system bus 550. FIG. 2B depicts an embodiment of a computing device 500 in which the central processing unit 521 communicates directly with main memory 522 via a memory port 503.

FIG. 2B depicts an embodiment in which the central processing unit 521 communicates directly with cache memory 540 via a secondary bus, sometimes referred to as a backside bus. In other embodiments, the central processing unit 521 communicates with cache memory 540 using the system bus 550. Cache memory 540 typically has a faster response time than main memory 522. In the embodiment shown in FIG. 2A, the central processing unit 521 communicates with various I/O devices 530 via a local system bus 550. Various buses may be used as a local system bus 550, including a Video Electronics Standards Association (VESA) Local bus (VLB), an Industry Standard Architecture (ISA) bus, an Extended Industry Standard Architecture (EISA) bus, a MicroChannel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI Extended (PCI-X) bus, a PCI-Express bus, or a NuBus. For embodiments in which an I/O device is a display device 530 c, the central processing unit 521 may communicate with the display device 530 c through an Accelerated Graphics Port (AGP). FIG. 2B depicts an embodiment of a computer 500 in which the central processing unit 521 communicates directly with I/O device 530 e. FIG. 2B also depicts an embodiment in which local buses and direct communication are mixed: the central processing unit 521 communicates with I/O device 530 d using a local system bus 550 while communicating with I/O device 530 e directly.

A wide variety of I/O devices 530 may be present in the computing device 500. Input devices include one or more keyboards 530 a, mice, trackpads, trackballs, microphones, and drawing tablets. Output devices include video display devices 530 c, speakers, and printers. An I/O controller 523, as shown in FIG. 2A, may control the I/O devices. The I/O controller may control one or more I/O devices such as a keyboard 530 a and a pointing device 530 b, e.g., a mouse or optical pen.

Referring again to FIG. 2A, the computing device 500 may support one or more removable media interfaces 516, such as a floppy disk drive, a CD-ROM drive, a DVD-ROM drive, tape drives of various formats, a USB port, a Secure Digital or COMPACT FLASH™ memory card port, or any other device suitable for reading data from read-only media, or for reading data from, or writing data to, read-write media. An I/O device 530 may be a bridge between the system bus 550 and a removable media interface 516.

The removable media interface 516 may, for example, be used for installing software and programs. The computing device 500 may further include a storage device 528, such as one or more hard disk drives or hard disk drive arrays, for storing an operating system and other related software, and for storing application software programs. Optionally, a removable media interface 516 may also be used as the storage device. For example, the operating system and the software may be run from a bootable medium, for example, a bootable CD.

In some embodiments, the computing device 500 may include or be connected to multiple display devices 530 c, which each may be of the same or different type and/or form. As such, any of the I/O devices 530 and/or the I/O controller 523 may include any type and/or form of suitable hardware, software, or combination of hardware and software to support, enable or provide for the connection to, and use of, multiple display devices 530 c by the computing device 500. For example, the computing device 500 may include any type and/or form of video adapter, video card, driver, and/or library to interface, communicate, connect or otherwise use the display devices 530 c. In one embodiment, a video adapter may include multiple connectors to interface to multiple display devices 530 c. In other embodiments, the computing device 500 may include multiple video adapters, with each video adapter connected to one or more of the display devices 530 c. In some embodiments, any portion of the operating system of the computing device 500 may be configured for using multiple display devices 530 c. In other embodiments, one or more of the display devices 530 c may be provided by one or more other computing devices, connected, for example, to the computing device 500 via a network. These embodiments may include any type of software designed and constructed to use the display device of another computing device as a second display device 530 c for the computing device 500. One of ordinary skill in the art will recognize and appreciate the various ways and embodiments that a computing device 500 may be configured to have multiple display devices 530 c.

A computing device 500 of the sort depicted in FIG. 2A and FIG. 2B may operate under the control of an operating system, which controls scheduling of tasks and access to system resources. The computing device 500 may be running any operating system, any embedded operating system, any real-time operating system, any open source operating system, any proprietary operating system, any operating systems for mobile computing devices, or any other operating system capable of running on the computing device and performing the operations described herein.

The computing device 500 may be any workstation, desktop computer, laptop or notebook computer, server machine, handheld computer, mobile telephone or other portable telecommunication device, media playing device, gaming system, mobile computing device, or any other type and/or form of computing, telecommunications or media device that is capable of communication and that has sufficient processor power and memory capacity to perform the operations described herein. In some embodiments, the computing device 500 may be a virtualized computing device and the virtualized computing device may be running in a networked or cloud based environment. In some embodiments, the computing device 500 may have different processors, operating systems, and input devices consistent with the device.

In other embodiments, the computing device 500 is a mobile device, such as a Java-enabled cellular telephone or personal digital assistant (PDA), a smart phone, a digital audio player, or a portable media player. In some embodiments, the computing device 500 includes a combination of devices, such as a mobile phone combined with a digital audio player or portable media player.

As shown in FIG. 2C, the central processing unit 521 may include multiple processors P1, P2, P3, P4, and may provide functionality for simultaneous execution of instructions or for simultaneous execution of one instruction on more than one piece of data. In some embodiments, the computing device 500 may include a parallel processor with one or more cores. In one of these embodiments, the computing device 500 is a shared memory parallel device, with multiple processors and/or multiple processor cores, accessing all available memory as a single global address space. In another of these embodiments, the computing device 500 is a distributed memory parallel device with multiple processors each accessing local memory only. In still another of these embodiments, the computing device 500 has both some memory which is shared and some memory which may only be accessed by particular processors or subsets of processors. In yet another of these embodiments, the central processing unit 521 includes a multicore microprocessor, which combines two or more independent processors into a single package, e.g., into a single integrated circuit (IC). In one exemplary embodiment, depicted in FIG. 2D, the computing device 500 includes at least one central processing unit 521 and at least one graphics processing unit 521′.

In some embodiments, a central processing unit 521 provides single instruction, multiple data (SIMD) functionality, e.g., execution of a single instruction simultaneously on multiple pieces of data. In other embodiments, several processors in the central processing unit 521 may provide functionality for execution of multiple instructions simultaneously on multiple pieces of data (MIMD). In still other embodiments, the central processing unit 521 may use any combination of SIMD and MIMD cores in a single device.

A computing device may be one of a plurality of machines connected by a network, or it may include a plurality of machines so connected. FIG. 2E shows an exemplary network environment. The network environment includes one or more local machines 502 a, 502 b (also generally referred to as local machine(s) 502, client(s) 502, client node(s) 502, client machine(s) 502, client computer(s) 502, client device(s) 502, endpoint(s) 502, or endpoint node(s) 502) in communication with one or more remote machines 506 a, 506 b, 506 c (also generally referred to as server machine(s) 506 or remote machine(s) 506) via one or more networks 504. In some embodiments, a local machine 502 has the capacity to function as both a client node seeking access to resources provided by a server machine and as a server machine providing access to hosted resources for other clients 502 a, 502 b. Although only two clients 502 and three server machines 506 are illustrated in FIG. 2E, there may, in general, be an arbitrary number of each. The network 504 may be a local-area network (LAN), e.g., a private network such as a company Intranet, a metropolitan area network (MAN), or a wide area network (WAN), such as the Internet, or another public network, or a combination thereof.

The computing device 500 may include a network interface 518 to interface to the network 504 through a variety of connections including, but not limited to, standard telephone lines, local-area network (LAN), or wide area network (WAN) links, broadband connections, wireless connections, or a combination of any or all of the above. Connections may be established using a variety of communication protocols. In one embodiment, the computing device 500 communicates with other computing devices 500 via any type and/or form of gateway or tunneling protocol such as Secure Sockets Layer (SSL) or Transport Layer Security (TLS). The network interface 518 may include a built-in network adapter, such as a network interface card, suitable for interfacing the computing device 500 to any type of network capable of communication and performing the operations described herein. An I/O device 530 may be a bridge between the system bus 550 and an external communication bus.

According to various embodiments of the present invention, exploration and discovery technologies are directed toward discovering interesting phenomena (e.g., detecting and organizing data into topics) without user input; in other words, identifying information that is relevant to the user without the user explicitly looking for this information. Embodiments of the present invention are also directed to organizing the detected topics into a taxonomy or hierarchy. Categorization technologies are generally focused on classifying documents (e.g., text, audio, and video) into predefined categories such as “all the calls in which a customer has asked to speak to a supervisor.”

FIG. 3 is a screenshot of a category distribution report according to one embodiment of the present invention. In this report, the voice calls (customer-agent phone conversations, or interactions) that have occurred in the last 7 days have been classified into categories (e.g., predefined categories) that represent the set of known reasons for calls. In other embodiments, conversations may be aggregated over different time periods (e.g., over the past day, over the past hour, over the past month, since a particular date, or between two arbitrary dates). In addition, in other embodiments, the interactions may be limited to particular communication channels, such as one or more of telephone, email, chat, and social media, limited to interactions from particular contact centers, or limited to interactions from particular departments (e.g., sales or customer support).

FIG. 4 is a screenshot illustrating an interface for customizing and defining predefined categories according to one embodiment of the present invention. Each predefined category can be defined as a Boolean expression of topics, where each topic may be defined as a union of phrases or words, thereby producing a set of categorizing rules used to classify the interactions. For example, FIG. 4 illustrates the definition of the “Repeat Call or Contact” category, which is defined by interactions having ‘Found topic “Repeat Calls” at least once with Very-Low strictness OR Found topic “Repeat Contacts” at least once with Very-Low strictness’. The “Repeat Calls” and “Repeat Contacts” topics may be triggered, for example, by detecting particular triggering events such as a record of multiple calls from a particular phone number or by identifying particular phrases in the interaction such as “thanks for calling again”.

When one of the phrases of the Boolean expression is spoken in a conversation, various speech recognition technologies can recognize it in the audio. (One such technology is phrase recognition as described in U.S. Pat. No. 7,487,094 “System and method of call classification with context modeling based on composite words,” the content of which is incorporated herein by reference.) In other embodiments, the interactions are conducted over other media (for example, text chat) and other appropriate methods of detecting phrases are used. Detecting one of these phrases triggers the detection of the topics to which the phrase belongs. The detected topics are fed to the categorizing rules, the rules matching a given category are triggered, and the interaction is labeled in accordance with the matching categories.

Therefore, according to one embodiment, the analytics server 45 can generate the category distribution report by counting the number of interactions within a given time period that fall within each category.
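The chain from phrases to topics to categories to a category distribution can be sketched as follows. This is a minimal illustration only: the topic phrases, the “Escalation” category, and the function names are hypothetical, and plain substring matching stands in for the phrase recognition and strictness settings described above.

```python
from collections import Counter

# Hypothetical topics: each topic is a union of trigger phrases.
TOPICS = {
    "Repeat Calls": ["thanks for calling again", "called yesterday"],
    "Repeat Contacts": ["contacted you before"],
    "Supervisor": ["speak to a supervisor"],
}

# Hypothetical categories: each is a Boolean expression over detected topics.
CATEGORIES = {
    "Repeat Call or Contact": lambda t: "Repeat Calls" in t or "Repeat Contacts" in t,
    "Escalation": lambda t: "Supervisor" in t,
}

def detect_topics(text):
    """Return the set of topics whose phrases occur in the text."""
    lowered = text.lower()
    return {name for name, phrases in TOPICS.items()
            if any(p in lowered for p in phrases)}

def categorize(text):
    """Return every category whose Boolean rule is satisfied."""
    topics = detect_topics(text)
    return [name for name, rule in CATEGORIES.items() if rule(topics)]

def category_distribution(interactions):
    """Count how many interactions fall within each category."""
    counts = Counter()
    for text in interactions:
        counts.update(categorize(text))
    return counts
```

An interaction may satisfy more than one rule, so the counts in the distribution can sum to more than the number of interactions.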

According to one embodiment of the present invention, an analytics server 45 provides a user with the ability to view or “explore” related words, as illustrated, for example, in FIG. 5. A user can start from a single word and explore the co-occurrence of the starting word with other words in various conversations. For instance, FIG. 5 depicts the relationship or co-occurrence of the word “credit” with other words in the set of relevant calls.
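One simple way to realize such an exploration is to count, for a starting word, the other words that appear in the same conversations. The sketch below assumes whitespace tokenization and conversation-level co-occurrence; the function name and the optional stop-word set are hypothetical and not taken from the described system.

```python
from collections import Counter

def cooccurring_words(start, conversations, stop_words=frozenset()):
    """Count words that appear in the same conversation as `start`."""
    counts = Counter()
    for text in conversations:
        words = set(text.lower().split())
        if start in words:
            # Each conversation contributes at most once per word.
            counts.update(words - {start} - stop_words)
    return counts
```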

The detected topics can be categorized and organized into a global taxonomy of topics. FIG. 6A depicts the relationship of identified topics in a large collection of documents as a taxonomy (e.g., “global taxonomy”) according to one embodiment of the present invention. FIGS. 6B and 6C illustrate clusters within the larger taxonomy according to one embodiment of the present invention.

FIG. 7 is a flowchart illustrating a process for identifying topics and generating a taxonomy according to one embodiment of the present invention. Referring to FIG. 7, in operation 100, information is extracted from a collection of documents using linguistic rules. The extracted information may be referred to herein as “fragments.” In operation 200, fragments are filtered, and, in operation 300, the filtered fragments are clustered into base clusters. For example, base clusters may include “payment plan” and “payment arrangement.” In operation 400, the base clusters are further semantically clustered based on semantic distance to automatically generate a semantic hierarchy of “hyper clusters” or “parent categories.” Continuing the example, the hyper clustering process may combine the “payment plan” and “payment arrangement” base clusters into one semantic “payment arrangement” parent category (or hyper cluster).

According to various embodiments of the present invention, the semantic distance may be computed based on semantic similarity and co-occurrence analysis. In addition, embodiments of the present invention are directed to tracking the development of topics and categories over time to detect if a topic is new or part of an existing taxonomy of a domain.

The fragment extraction operation 100, the filtering operation 200, the clustering operation 300, and the hyper clustering operation 400 according to embodiments of the present invention will be described in more detail below.

FIG. 8A is a flowchart illustrating a method for extracting fragments according to one embodiment of the present invention. FIG. 8B is a block diagram illustrating a fragment extracting module 45 a of the analytics server 45 for extracting fragments from interactions (e.g., text and text transcriptions of audio) according to one embodiment of the present invention. Fragments are extracted from interactions by supplying the entire body of interactions (or the entire body of text) to the system (e.g., the analytics server 45 as shown in FIG. 1, which may be a computer system 500 as shown in FIG. 2A, including the fragment extracting module 45 a as shown in FIG. 1) configured to perform the clustering.

When the interactions being processed are the output of large-vocabulary continuous speech recognition (LVCSR) (e.g., as performed by speech analytics module 44), low confidence words may be filtered out in optional operation 102 before supplying the text to the fragment extraction process, so that only words with high confidence remain in the text to be processed. In some embodiments, if the exploration is done on email, chat, or other text, the entire text is used (e.g., without filtering based on confidence).
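The optional confidence filter of operation 102 amounts to dropping recognized words whose confidence falls below a cutoff. A minimal sketch follows; the `(word, confidence)` input layout and the 0.7 threshold are assumptions made for illustration, not values given in this description.

```python
def filter_low_confidence(recognized_words, threshold=0.7):
    """Keep only words whose recognition confidence meets the threshold.

    recognized_words: iterable of (word, confidence) pairs, a hypothetical
    representation of LVCSR output.
    """
    return " ".join(word for word, conf in recognized_words if conf >= threshold)
```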

In operation 104, “parts of speech” are identified in the supplied text (e.g., the text of emails, chats, etc. or the output of an LVCSR system) by the PoS tagger 144. Table 1 provides an example of an input piece of text and an output in which the words are labeled with their parts of speech:

TABLE 1

Input: I am calling because I want to make a payment arrangement on my balance

Output: I/PRP am/VBP calling/VBG because/IN I/PRP want/VBP to/TO make/VB a/DT payment/NN arrangement/NN on/IN my/PRP balance/NN

Methods for automatically analyzing text and tagging its words with their parts of speech are well known to those of ordinary skill in the art. (See, e.g., Toutanova, K. et al. “Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network” NAACL 2003 Vol. 1, 173-180.)

In operation 106, manually extracted key fragments are automatically linguistically analyzed by the linguistic analysis module 146. The manually extracted key fragments are extracted by, for example, an expert (or “auditor”) who has highlighted particular phrases as being semantically relevant. For example, given a sentence “I am calling because I want to make a payment arrangement,” the auditor may mark the fragment “make a payment arrangement” as being particularly relevant. The auditor may also mark fragments that convey the reason for contacting the contact center, fragments that reflect the resolution of the interaction, fragments that relate to events related to the process of the resolution (e.g., “let me transfer you to my supervisor” or “please hold while I transfer you to a connections representative”), and fragments that could otherwise be of interest to those analyzing the interactions with the contact center.

In operation 106, optional automatic linguistic analysis of the manually extracted key fragments by a linguistic analysis module 146 generates a set of extraction rules. In some embodiments of the present invention, the extraction rules are generated by performing the linguistic analysis manually (e.g., by the auditor or other expert). For example, the manually extracted fragment “make a payment arrangement” can be analyzed based on the parts of speech and can be represented as “make/VB a/DT payment/NN arrangement/NN”. The sequence of parts of speech (or “PoS sequence”) of the above fragment would therefore be VB DT NN NN.

Generally, phrases spoken or written by people can have many syntactic variations. For example, “make a payment arrangement” could also be phrased “urgently need to make payment arrangements.” In one embodiment, the auditor creates a generalized set of patterns rather than a sequence of parts of speech (or “PoS”) tags. In addition, generalized patterns also allow capture of similar structures having different semantics, which could also be important. For example, a fragment containing a verb followed by an object (a “Verb-Object fragment”) is a Verb Phrase (VP) followed by a Noun Phrase (NP). This Verb-Object can be expressed as a regular expression:

Verb_Object = VP & NP

where “&” signifies concatenation and where:

VP = optRB & “(\w+/VB.?)+” & optRB
optRB = “(\w+/RB.?)*”
NP = (“\w+/(DT|PRP\$?)”) & Adj & “(\w+/(NN.?))+”
Adj = “(\w+/(JJ.??|VBN|VBG)?)*”

The above regular expression will also capture an optional adjective (Adj) before the noun and an optional adverb (RB) before or after the verb, and therefore also matches new fragments such as “quickly start the services.”
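The Verb-Object pattern can be approximated in Python as shown below. This is a sketch rather than the exact expression above: it assumes whitespace-separated `word/TAG` tokens with Penn Treebank tags, adds explicit whitespace handling and token-boundary checks that the expression above leaves implicit, and spells out the noun tags as `NN`, `NNS`, `NNP`, and `NNPS`. The helper name `tok` is hypothetical.

```python
import re

def tok(tag_pattern):
    # One "word/TAG" token plus trailing whitespace; the (?!\S) lookahead
    # stops a tag pattern from matching only the prefix of a longer tag.
    return r"(?:\w+/(?:" + tag_pattern + r")(?!\S)\s*)"

OPT_RB = tok(r"RB\w?") + "*"                       # optional adverbs
VP = OPT_RB + tok(r"VB\w?") + "+" + OPT_RB         # verb phrase
ADJ = tok(r"JJ\w?|VBN|VBG") + "*"                  # optional adjectives
NP = tok(r"DT|PRP\$?") + ADJ + tok(r"NNP?S?") + "+"  # noun phrase
VERB_OBJECT = re.compile(r"(?<!\S)" + VP + NP)     # start at a token boundary

def extract_verb_object(tagged_text):
    """Return all Verb-Object fragments found in PoS-tagged text."""
    return [m.group(0).strip() for m in VERB_OBJECT.finditer(tagged_text)]
```

For the tagged sentence “I/PRP want/VBP to/TO make/VB a/DT payment/NN arrangement/NN”, this sketch returns the single fragment “make/VB a/DT payment/NN arrangement/NN”.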

In addition, when the source of the text is speech, in some embodiments the automatic speech recognition system (ASR) and the extraction rules can be tuned for use with speech recognition results. For example, if articles (e.g., “a,” “an,” and “the”) are often deleted or not present in speech recognition results, they can be made optional in the extraction rules. In addition, if only speech recognition results above a particular level of confidence are included in the text, this confidence may be tuned to optimize the process of fragment extraction.

In one embodiment of the present invention, all of the manually extracted fragments, along with their PoS tags, are supplied as input to a sequential pattern mining algorithm as described, for example, in U.S. patent application Ser. No. 13/952,459 “SYSTEM AND METHOD FOR DISCOVERING AND EXPLORING CONCEPTS,” filed on Jul. 26, 2013, the entire disclosure of which is incorporated herein by reference and U.S. patent application Ser. No. 13/952,470 “SYSTEM AND METHOD FOR DISCOVERING AND EXPLORING CONCEPTS AND ROOT CAUSES OF EVENTS,” filed on Jul. 26, 2013, the entire disclosure of which is incorporated herein by reference. By applying this algorithm to the manually extracted fragments and their PoS tags, the PoS sequences of interest (e.g., the extraction rules) can be automatically extracted.

In another embodiment of the present invention, the manually extracted key fragments can be used as a “gold standard.” For example, several thousand manually extracted key fragments can be used to create a set of extraction rules so as to optimize the precision and recall from this standard. FIG. 8C is a flowchart illustrating a method 170 for automatically generating a set of extraction rules given a standard. After initially setting a set of extraction rules, the quality of the rules is evaluated by running the rules on the standard to generate a “precision” (or accuracy):

$\mathrm{precision} = \frac{tp}{tp + fp}$

and a “recall” (or detection rate):

$\mathrm{recall} = \frac{tp}{tp + fn}$

where “tp” stands for “true positive,” “fp” stands for “false positive,” and “fn” stands for “false negative.”

The precision and recall can be combined to generate an “f-measure,” where

f = w₁ × precision + w₂ × recall

and where the weights w₁ and w₂ can be adjusted to alter the relative influences of precision and recall on the f-measure based on the usage scenario.
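As a worked illustration of these three measures, the following sketch computes them from a set of extracted fragments and a set of gold-standard fragments. The function names, the set-based representation, and the default weights of 0.5 are assumptions made for the example.

```python
def precision_recall(extracted, gold):
    """Compare extracted fragments against a gold standard."""
    extracted, gold = set(extracted), set(gold)
    tp = len(extracted & gold)   # extracted and in the standard
    fp = len(extracted - gold)   # extracted but not in the standard
    fn = len(gold - extracted)   # in the standard but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def f_measure(precision, recall, w1=0.5, w2=0.5):
    """Weighted linear combination of precision and recall, as defined above."""
    return w1 * precision + w2 * recall
```

With four extracted fragments of which two are in a three-fragment standard, tp = 2, fp = 2, and fn = 1, giving a precision of 0.5 and a recall of 2/3.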

Referring to FIG. 8C, in operation 172, an initial set of extraction rules is generated (e.g., supplied from an external source such as an expert manually identifying one or more extraction rules of interest). Operation 174 computes the precision and recall of the rules as applied to the manually extracted key fragments. The computed precision and recall are then compared, in operation 176, against a threshold value to evaluate whether or not the rules are good enough. If not, then in operation 178 sequences of parts of speech that are missing from the rules but present in the manually extracted key fragments are identified. In operation 180, the identified missing part of speech sequences are added to the set of extraction rules and the process repeats from operation 174 with the updated set of extraction rules. The method 170 updates the extraction rules based on missing parts of speech sequences until the precision and recall values meet the threshold, at which point the computed extraction rules are output and the process ends.
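The loop of method 170 can be sketched as follows, treating each extraction rule as an exact tuple of PoS tags. Several details are assumptions made for illustration: separate precision and recall thresholds, the `max_iterations` cap, and the early exit when no missing sequence remains (without which the loop would not terminate if precision stays low). All function names are hypothetical.

```python
def pos_sequence(tagged_fragment):
    """'make/VB a/DT' -> ('VB', 'DT')"""
    return tuple(t.rsplit("/", 1)[1] for t in tagged_fragment.split())

def apply_rules(rules, tagged_sentence):
    """Extract every contiguous token span whose tags equal a rule."""
    tokens = tagged_sentence.split()
    tags = [t.rsplit("/", 1)[1] for t in tokens]
    found = set()
    for rule in rules:
        n = len(rule)
        for i in range(len(tokens) - n + 1):
            if tuple(tags[i:i + n]) == rule:
                found.add(" ".join(tokens[i:i + n]))
    return found

def evaluate(rules, sentences, gold):
    """Precision and recall of the rules against the gold fragments."""
    extracted = set()
    for sentence in sentences:
        extracted |= apply_rules(rules, sentence)
    tp = len(extracted & gold)
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

def refine_rules(initial_rules, sentences, gold,
                 min_precision, min_recall, max_iterations=10):
    rules = set(initial_rules)                                   # operation 172
    for _ in range(max_iterations):
        precision, recall = evaluate(rules, sentences, gold)     # operation 174
        if precision >= min_precision and recall >= min_recall:  # operation 176
            break
        missing = {pos_sequence(f) for f in gold} - rules        # operation 178
        if not missing:
            break  # assumption: stop when there is nothing left to add
        rules |= missing                                         # operation 180
    return rules
```

Adding every missing sequence raises recall but can lower precision, which is why the two values are checked together after each update.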

In some embodiments of the present invention, rather than using manually extracted key fragments from among the documents, the supplied input is a set of sentences that are known to be of interest. The method 170 can also be applied to this set of sentences, with the exception that only recall (and not precision) can be used to evaluate the quality of the extraction rules.

Referring again to FIGS. 8A and 8B, in operation 108, the fragment extraction module 148 extracts fragments from the tagged text using the extraction rules. Continuing the above example, given the identified extraction rule of VB DT NN NN and given some set of tagged text that includes: “run/VB a/DT credit/NN check/NN” and “moving/VB any/DT gas/NN appliances/NNS”, both of these fragments would be extracted from the tagged text because both match the VB DT NN NN extraction rule.

The fragments that were extracted in operation 100 are filtered in operation 200. The fragment extraction process in operation 100 will often include fragments that match an extraction rule (e.g., satisfy a particular order of parts of speech), but are not informative. Many of these “false accepts” can be filtered out in operation 200 based on frequency: fragments that appear only rarely are removed, and only fragments that appear frequently are retained.

In addition, some fragments can be filtered out based on having a low inverse document frequency (IDF), where the inverse document frequency (IDF) of a word is used to measure the saliency of word w, and the saliency of the fragment is given by the sum of the IDFs of each of the words in the fragment:

$\mathrm{IDF}(w) = \log\left( \frac{N}{\mathrm{DF}(w)} \right)$

$\mathrm{Saliency}(\mathrm{fragment}) = \sum_{w \in \mathrm{fragment}} \mathrm{IDF}(w)$

where N is the total number of documents in the collection and DF(w) is the number of documents in which the word w appears.
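A small sketch of the IDF and saliency computations follows. It assumes whitespace tokenization and the natural logarithm, and it skips words that do not appear in the document collection; these choices, along with the function names, are illustrative assumptions.

```python
import math

def document_frequency(documents):
    """DF(w): number of documents in which each word appears."""
    df = {}
    for doc in documents:
        for word in set(doc.lower().split()):
            df[word] = df.get(word, 0) + 1
    return df

def idf(word, df, n_docs):
    """IDF(w) = log(N / DF(w))."""
    return math.log(n_docs / df[word])

def saliency(fragment, df, n_docs):
    """Sum of the IDFs of the fragment's words (unseen words are skipped)."""
    return sum(idf(w, df, n_docs) for w in fragment.lower().split() if w in df)
```

A word that appears in every document, such as “a” in a typical collection, has an IDF of log(1) = 0 and contributes nothing to the saliency of a fragment.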

However, some fragments have high IDF but may still be uninformative. For example, in some contexts a fragment like “have your phone number please” might be a fragment that matches one of the PoS extraction rules, but bears no information because almost every interaction with a particular contact center might ask the caller for his or her phone number for callback purposes.

In one embodiment of the present invention, a stop list of fragments can be used to further filter the extracted fragments, in which fragments that appear in the stop list are removed from the set of fragments to be considered. For example, in one embodiment the stop list includes a list of words and a fragment is filtered out (or removed) if all of the words in the fragment are on the stop list (e.g., if the fragment is made up only of words that are in the stop list).

In one embodiment of the present invention, the stop list can also include one or more regular expressions where, if a fragment matches any one of the regular expressions, the fragment is removed. For example, the regular expression “*expedite*call*” would match fragments “to expedite your call” and “to expedite this call”, both of which are uninformative fragments in the context of a contact center.

In one embodiment of the present invention, the filtering process 200 only filters based on particular PoS tags. For example, the filtering process may only determine whether all of the words in a fragment tagged as nouns are in the stop list. In this example, if all of the nouns in the fragment are in the stop list, then the fragment is filtered out. In another example, only the verb phrases are analyzed by the filtering process.
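The three stop-list variants described above (word list, regular expressions, and PoS-restricted checks) can be combined in one small filter. In this sketch the stop words are hypothetical, the wildcard pattern “*expedite*call*” is rewritten in Python regular expression syntax, and the `tags_to_check` parameter (a tuple of tag prefixes such as `("NN",)`) is an assumed way of restricting the check to particular parts of speech.

```python
import re

STOP_WORDS = {"have", "your", "phone", "number", "please", "call", "to"}  # hypothetical
STOP_PATTERNS = [re.compile(r".*expedite.*call.*")]  # "*expedite*call*" as a regex

def is_filtered(tagged_fragment, tags_to_check=None):
    """Return True if the fragment should be removed.

    tagged_fragment: whitespace-separated word/TAG tokens.
    tags_to_check: optional tuple of tag prefixes; when given, only words
    with those tags are compared against the stop list.
    """
    pairs = [tok.rsplit("/", 1) for tok in tagged_fragment.split()]
    text = " ".join(word for word, _ in pairs)
    if any(pattern.search(text) for pattern in STOP_PATTERNS):
        return True
    words = [word.lower() for word, tag in pairs
             if tags_to_check is None or tag.startswith(tags_to_check)]
    # Removed only if there is at least one checked word and all are stop words.
    return bool(words) and all(w in STOP_WORDS for w in words)
```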

According to one embodiment, fragments that are semantically related are grouped together (or clustered) as conveying the same idea in operation 300. FIG. 9A is a flowchart illustrating a method for generating base clusters of filtered fragments according to one embodiment of the present invention.

Clustering is a machine learning technique that can be used to take fragments as input and to cluster the fragments together when the important portions of the fragments appear to be similar or the same. Each one of these clusters is a concept (or topic) as mentioned above.

FIG. 9B is a flowchart illustrating a method for clustering fragments according to one embodiment of the present invention. In operation 312, the clustering module 45 b of the analytics server 45 selects a subset of the filtered fragments to serve as centers (or “templates”) for the clusters. In operation 314, the similarity between each of the remaining fragments (fragments that were not selected to be templates) and each of the templates is computed. In operation 316, the fragments are assigned to clusters based on the computed similarities (e.g., each fragment is assigned to a cluster corresponding to the template that is most similar to the fragment). In operation 318, the clustering module 45 b removes clusters that have few or no fragments attached to them. These templates (and the fragments of their associated clusters) are returned to the pool of unassigned fragments. In operation 320, the clustering module 45 b determines if some set of ending conditions has been met (e.g., if all the fragments have been tried as templates or if a certain number of iterations of the algorithm have been executed). If not, then the process repeats with the selection of new fragments to serve as additional templates in operation 312. After the ending conditions have been met, the clustered fragments are output as base clusters.
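The template-based loop of FIG. 9B can be sketched as below. The similarity measure is not specified at this point in the description, so Jaccard word overlap is used purely as a stand-in; the batch size, similarity cutoff, minimum cluster size, and iteration cap are likewise hypothetical parameters. The sketch assumes the input fragments are unique.

```python
def jaccard(a, b):
    """Word-overlap similarity, a stand-in for the actual similarity measure."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def cluster_fragments(fragments, batch_size=2, min_similarity=0.3,
                      min_cluster_size=2, max_iterations=10):
    clusters = {}                 # template -> member fragments (template included)
    unassigned = list(fragments)
    tried = set()                 # fragments already used as templates
    for _ in range(max_iterations):                       # operation 320: iteration cap
        candidates = [f for f in unassigned if f not in tried]
        for template in candidates[:batch_size]:          # operation 312
            tried.add(template)
            unassigned.remove(template)
            clusters[template] = [template]
        remaining, unassigned = unassigned, []
        for frag in remaining:                            # operations 314 and 316
            best = max(clusters, key=lambda t: jaccard(frag, t), default=None)
            if best is not None and jaccard(frag, best) >= min_similarity:
                clusters[best].append(frag)
            else:
                unassigned.append(frag)
        small = [t for t, members in clusters.items()
                 if len(members) < min_cluster_size]
        for template in small:                            # operation 318
            unassigned.extend(clusters.pop(template))
        if not candidates:                                # operation 320: all tried
            break
    return clusters, unassigned
```

Fragments from a pruned cluster return to the unassigned pool and get another chance to join a surviving cluster on the next pass.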

To make the clustering faster, in one embodiment, only the most salient fragments are used. As such, in some embodiments of the present invention, the fragments are pruned by sorting the fragments by saliency and discarding the fragments with low saliency relative to the top ones. For example, in one embodiment, fragments with less than 5% of the saliency of the top ones are removed from consideration. The fragments are clustered to group together similar fragments that differ from one another only by less-salient words. The similarity of fragments can be measured based on various text mining measures, and is described in more detail below.

The saliency of each cluster may be computed based on text mining measures. According to one embodiment, the saliency of a cluster is constructed from a weighted sum of the saliencies of the fragments of the cluster:

$S(\mathrm{Cluster}) = \sum_{\mathrm{fragment} \in \mathrm{Cluster}} \frac{\text{fragment weight}}{\text{cluster size}} \, \mathrm{Saliency}(\mathrm{fragment})$

In a manner similar to that described for fragment pruning, in one embodiment, only the top clusters will be presented to the user and clusters with lower saliencies may be pruned away.
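The cluster saliency and the relative pruning step can be illustrated as follows. The description does not define how the fragment weight is obtained, so it is simply passed in; interpreting “the top ones” as the single highest saliency, and reusing the 5% ratio for clusters, are further assumptions of this sketch.

```python
def cluster_saliency(members):
    """Weighted sum of fragment saliencies.

    members: list of (fragment_weight, fragment_saliency) pairs.
    """
    size = len(members)
    return sum(weight / size * sal for weight, sal in members)

def prune_by_saliency(scored, ratio=0.05):
    """Keep items whose saliency is at least `ratio` times the top saliency.

    scored: dict mapping a fragment or cluster label to its saliency.
    """
    if not scored:
        return {}
    top = max(scored.values())
    return {item: s for item, s in scored.items() if s >= ratio * top}
```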

Referring back to FIG. 9A, in operation 330, the clustering module 45 b labels the base clusters obtained in operation 310. According to one embodiment, the base clusters are automatically labeled with words or phrases that describe the given base clusters. For example, given a base cluster that contains the fragments “have a medical baseline,” “is called medical baseline,” and “we also offer a medical baseline,” one would expect this cluster automatically to be labeled “medical baseline.”

FIG. 9C is a flowchart illustrating a method for naming clusters according to one embodiment of the present invention. In operation 332, given a base cluster to be named, the clustering module 45 b of the analytics server 45 extracts noun phrases from the base cluster's fragments. Stems from the noun phrases are then extracted in operation 334, the stems of stop words are, optionally, removed from the extracted noun phrases, and the distribution of the remaining noun phrases in the fragments is computed. The result is a set of probability distributions P(stem|cluster) (or rate of appearance of stems of noun phrases in a given cluster, which may be referred to as a “stem distribution”) for all of the stems in the noun phrases of the clusters. Noun phrases having a probability below a threshold value are removed.

Given the list of stems and the probability distributions calculated in operation 334, in operation 336 the clustering module 45 b attempts to identify a noun phrase that contains all of the stems from the list of stems (or a noun phrase having the highest probability of appearing in the cluster according to the stem distribution). If such a phrase exists, then in operation 338 this noun phrase is output as the label for the given base cluster. If more than one such noun phrase exists, then the noun phrase having the highest frequency is taken as the label.

If no such phrase is found, then in operation 340, the clustering module 45 b determines if more stems are available from the list. If so, then the stem having the lowest probability is removed from the list in operation 342 and the process repeats with operation 336 to attempt to identify a noun phrase that contains all of the remaining stems. If no more stems are available, then the method fails to label the given base cluster.
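The labeling loop of operations 332 through 342 can be sketched as follows. This is a simplified illustration under several assumptions: noun phrases are supplied already extracted, `stem` is a toy stand-in for a real stemmer, the stop-word list is illustrative, the probability threshold is applied to stems, and all names are hypothetical.

```python
from collections import Counter

STOP_WORDS = {"a", "an", "the"}  # illustrative list only


def stem(word):
    """Toy stemmer (assumption): lowercase and strip a trailing plural 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def label_cluster(noun_phrases, min_prob=0.2):
    """Return a label for a base cluster, or None if labeling fails."""
    # Operation 332 is assumed done upstream; here stop words are removed.
    phrases = []
    for phrase in noun_phrases:
        words = [w.lower() for w in phrase.split() if w.lower() not in STOP_WORDS]
        if words:
            phrases.append(" ".join(words))
    if not phrases:
        return None

    phrase_counts = Counter(phrases)
    phrase_stems = {p: {stem(w) for w in p.split()} for p in phrase_counts}

    # Operation 334: P(stem | cluster) as the fraction of noun-phrase
    # occurrences that contain the stem.
    total = len(phrases)
    stem_prob = Counter()
    for p, count in phrase_counts.items():
        for s in phrase_stems[p]:
            stem_prob[s] += count / total

    # Most probable stems first; low-probability stems are dropped.
    ordered = sorted(stem_prob.items(), key=lambda kv: (-kv[1], kv[0]))
    stems = [s for s, prob in ordered if prob >= min_prob]

    while stems:
        wanted = set(stems)
        # Operation 336: noun phrases containing all remaining stems.
        candidates = [p for p in phrase_counts if wanted <= phrase_stems[p]]
        if candidates:
            # Operation 338: prefer the most frequent candidate.
            return max(candidates, key=lambda p: (phrase_counts[p], p))
        # Operation 342: remove the least probable stem and retry.
        stems.pop()
    return None  # no stems left: labeling fails


label = label_cluster(["a medical baseline", "medical baseline", "a medical baseline"])
```

With the three noun phrases taken from the example fragments above, the sketch yields the label "medical baseline".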

Through the labeling process described above with respect to operation 330, two or more different base clusters may be labeled with the same label. In such a case, according to one embodiment, in operation 350, the clustering module 45 b merges base clusters having the same name. The new base cluster will contain all of the fragments from all of the base clusters having the same label, and this new base cluster will have the same label as the merged base clusters.
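A minimal sketch of the merge in operation 350, assuming each base cluster is represented as a `(label, fragments)` pair (a hypothetical layout):

```python
from collections import defaultdict


def merge_by_label(base_clusters):
    """Merge base clusters that share a label, pooling their fragments."""
    merged = defaultdict(list)
    for label, fragments in base_clusters:
        merged[label].extend(fragments)
    return dict(merged)


merged = merge_by_label([
    ("medical baseline", ["have a medical baseline"]),
    ("payment plan", ["set up a payment plan"]),
    ("medical baseline", ["is called medical baseline"]),
])
```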

In operation 370, according to one embodiment of the present invention, some of the base clusters will be removed or omitted from the final result. Base clusters can be omitted according to several rules including: omitting base clusters that the labeling operation 330 was not able to label; omitting base clusters where all of the words of the label are in a filtering list; and omitting a base cluster when the entropy of the stems of the base cluster's noun phrases is greater than a threshold and is greater than the entropy of the stems of the base cluster's verb phrases.
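The three omission rules can be sketched as a single predicate. The use of Shannon entropy in bits, the threshold value, and all names below are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def should_omit(label, noun_stems, verb_stems, filter_words, max_entropy=2.0):
    """Apply the three omission rules; `max_entropy` is a hypothetical threshold."""
    # Rule 1: the labeling operation failed.
    if label is None:
        return True
    # Rule 2: every word of the label is in the filtering list.
    if all(word in filter_words for word in label.split()):
        return True
    # Rule 3: noun-stem entropy exceeds both the threshold and verb-stem entropy.
    h_noun = entropy(noun_stems)
    h_verb = entropy(verb_stems)
    return h_noun > max_entropy and h_noun > h_verb
```

For example, a cluster whose noun phrases contain eight equally frequent stems has a noun-stem entropy of 3 bits, so it would be omitted under the default threshold unless its verb-stem entropy is at least as high.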

FIG. 10A is a flowchart illustrating a method for hyper clustering base clusters according to one embodiment of the present invention. FIG. 10B is a flowchart illustrating a method for calculating a semantic distance between two base clusters according to one embodiment of the present invention. FIG. 10C is a block diagram illustrating a system for hyper clustering base clusters according to one embodiment of the present invention.

Referring to FIGS. 10A and 10C, after clustering the fragments in operation 300, the semantic distance module 452 of the hyper clustering module 45 c of the analytics server 45 clusters the base clusters into hyper clusters in operation 400 to generate a semantic hierarchy. Generally, when performed on a typical collection of documents related to interactions in a contact center, operation 300 will generate a set of several thousand base clusters, each with a corresponding size and saliency measure (described in more detail below).

When considering a large set of clusters, one might consider organizing (or further clustering) these clusters into a second hierarchy (or hyper cluster) of semantics. For example, Table 2 below illustrates a set of base clusters and their associated hyper clusters:

TABLE 2
Hyper Cluster                   Base Cluster Label
Medical Discount Application    medical baseline
                                medical condition
                                medical equipment
                                doctor bill
Payment Arrangement             payment arrangement
                                payment plan
                                remaining balance
                                rest of balance

According to one embodiment, the second level hierarchy (or hyper clusters) uses another source of semantic information to group clusters, rather than relying on the frequency of identical words between different documents (or interactions). In particular, according to embodiments of the present invention the additional semantic information can come from existing systems for comparing the semantic similarity of words such as word2vec (see, e.g., Mikolov et al. “Distributed Representations of Words and Phrases and their Compositionality” NIPS 2013 3111-3119), DISCO (see, e.g., Kolb, P. “DISCO: A Multilingual Database of Distributionally Similar Words,” KONVENS 2008, Supplementary Volume 37-44), and WordNet® (see, e.g., Fellbaum, C. “WordNet and wordnets.” Brown, Keith et al. (eds.), Encyclopedia of Language and Linguistics, Second Edition, Oxford: Elsevier, 665-670). In addition, semantic information about the similarity of base clusters can be obtained based on the co-occurrence of the base clusters. These measures will be discussed in more detail below.

Given a measure of semantic similarity (D_(S)) and a measure of co-occurrence (D_(CO)) between two base clusters c₁ and c₂, according to one embodiment of the present invention, the semantic similarity (or “distance”) between the two base clusters (D(c₁,c₂)) is defined as:

$D(c_1, c_2) = \alpha \times D_S(c_1, c_2) + (1 - \alpha) \times D_{CO}(c_1, c_2)$

The measure of co-occurrence can be computed using, for example, point-wise mutual information (PMI) as described in, for example, Bouma, G. “Normalized (Pointwise) Mutual Information in Collocation Extraction.” Proc. of GSCL 2009.

According to one embodiment, the value of the constant α in the above equation is computed by calibrating the value against a manually generated “gold standard.” According to another embodiment, α is tuned by trial and error during the clustering process.

According to one embodiment, the hyper clustering module 45 c computes the semantic similarity D_(S) in operation 410 using a source for semantic similarity information such as WordNet®, DISCO, and word2vec. For example, DISCO and WordNet® provide databases of similarities and return a second order similarity (or semantic similarity) when supplied with two words, or may return merely a binary “similar” or “not similar” result. As another example, word2vec merely provides a software package that is trained by the user. In the case of word2vec, the input text (e.g., the text output of the speech recognition system and the full text of the chat transcripts, social media interactions, emails, etc. associated with the contact center) or various portions thereof may be supplied to train the word2vec models.

As such, according to one embodiment of the present invention, and referring to FIG. 10B, the semantic similarity D_(S) between two clusters c₁ and c₂ can be calculated by computing or looking up the semantic similarity of the corresponding labels of the clusters c₁ and c₂ using WordNet®, DISCO, or word2vec in operation 412. The base cluster labels are an appropriate point of comparison for the semantic distance between the clusters because the labels are chosen based on their ability to represent the fragments contained within their respective base clusters.

The co-occurrence measure D_(CO) can be calculated in operation 414 in a variety of ways according to various embodiments of the present invention. After the clustering process, every base cluster includes a set of fragments, where these fragments were extracted from various documents (or interactions, e.g., transcripts of chat and social media interactions, emails, automatic speech-recognized voice calls, etc.). As such, for every base cluster, the fraction of interactions in which this cluster occurs can be calculated as:

$P(c_0) = \frac{N(c_0)}{N}$

where N is the total number of interactions and N(c₀) is the number of interactions in which a fragment from cluster c₀ appears. As such, P(c₀) represents the probability of finding a fragment from cluster c₀ in any interaction with the contact center.

In addition, the probability that two clusters appear together can be calculated as follows:

$P(c_0, c_1) = \frac{N(c_0, c_1)}{N}$

where N(c₀,c₁) is the number of interactions in which fragments from both cluster c₀ and cluster c₁ appear.

From these two probabilities, P(c₀) and P(c₀,c₁), the first order similarity between clusters can be computed based on the pointwise mutual information (pmi) or the normalized pointwise mutual information (npmi) between them:

$\mathrm{pmi}(c_0, c_1) = \log \frac{P(c_0, c_1)}{P(c_0) \, P(c_1)}$

$\mathrm{npmi}(c_0, c_1) = \frac{\mathrm{pmi}(c_0, c_1)}{-\log P(c_0, c_1)}$

In one embodiment D_(CO)(c₀,c₁)=pmi(c₀,c₁).

In another embodiment, D_(CO)(c₀,c₁)=npmi(c₀,c₁).

For two clusters that always occur together, P(c₀,c₁)=P(c₀)=P(c₁) and npmi(c₀,c₁)=1, the maximal value of npmi. For two clusters that never appear together, P(c₀,c₁)=0 and npmi(c₀,c₁)=−1, the minimal value of npmi.
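The co-occurrence measures can be sketched directly from interaction counts. The function name and argument layout are hypothetical, and the handling of the two boundary cases (clusters that never co-occur, and clusters present in every interaction) is an assumption chosen to match the limiting values stated above.

```python
import math


def pmi_npmi(n_total, n_c0, n_c1, n_both):
    """Return (pmi, npmi) for two clusters from interaction counts.

    n_total: total number of interactions, N
    n_c0, n_c1: interactions containing a fragment of each cluster
    n_both: interactions containing fragments of both clusters
    """
    p0 = n_c0 / n_total
    p1 = n_c1 / n_total
    p01 = n_both / n_total
    if p01 == 0:
        # Never appear together: npmi takes its minimal value.
        return float("-inf"), -1.0
    pmi = math.log(p01 / (p0 * p1))
    if p01 == 1.0:
        # Both clusters occur in every interaction; -log(1) is 0,
        # so the maximal value is returned directly.
        return pmi, 1.0
    return pmi, pmi / -math.log(p01)


always = pmi_npmi(100, 20, 20, 20)       # clusters that always co-occur
independent = pmi_npmi(100, 50, 50, 25)  # statistically independent clusters
never = pmi_npmi(100, 30, 40, 0)         # clusters that never co-occur
```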

As such, in operation 416, the computed values D_(S) and D_(CO) are combined (e.g., in one embodiment D(c₁,c₂)=α×D_(S)(c₁,c₂)+(1−α)×D_(CO)(c₁,c₂)) and output.
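The combination in operation 416 is a convex mixture of the two measures. A minimal sketch, with a hypothetical function name and an arbitrary illustrative value of α:

```python
def combined_distance(d_s, d_co, alpha):
    """Combine semantic similarity and co-occurrence with weight alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return alpha * d_s + (1.0 - alpha) * d_co


# With alpha = 0.7: 0.7 * 0.8 + 0.3 * 0.5
d = combined_distance(d_s=0.8, d_co=0.5, alpha=0.7)
```

Setting α to 1 relies on the semantic source alone, and setting it to 0 relies on co-occurrence alone.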

Given a distance matrix D(c₁,c₂) between every pair of base clusters c₁ and c₂, according to one embodiment of the present invention, the cluster clustering module 454 of the hyper clustering module 45 c applies a clustering algorithm in operation 430 to determine whether or not two base clusters belong to the same hyper cluster. This clustering algorithm can be a modified version of the algorithm described above for clustering fragments into base clusters, but with the substitution of, for example, the selection of a random base cluster in operation 312, the use of distances D(c₁,c₂) for the computation of similarity in operation 314, and the assignment of base clusters to hyper clusters in operation 316. Other clustering techniques could also be used, such as the Chinese whispers algorithm described in, for example, Biemann, C. “Chinese whispers: an efficient graph clustering algorithm and its application to natural language processing problems.” Proc. of the First Workshop on Graph Based Methods for Natural Language Processing (2006) 73-80.
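One possible reading of this step is a template-based grouping: a base cluster is picked at random, every remaining base cluster that is sufficiently similar to it joins its hyper cluster, and the process repeats. The sketch below follows that reading only; it is not the Chinese whispers algorithm, and the threshold, the seed, the function names, and the convention that larger values of D mean greater similarity are all assumptions.

```python
import random


def hyper_cluster(labels, similarity, threshold=0.5, seed=0):
    """Group base clusters (identified by label) into hyper clusters.

    `similarity(a, b)` plays the role of D(c1, c2); larger values are
    assumed to mean "more similar".
    """
    rng = random.Random(seed)
    unassigned = list(labels)
    hyper_clusters = []
    while unassigned:
        # Select a random base cluster as the template.
        template = unassigned.pop(rng.randrange(len(unassigned)))
        members = [template]
        remaining = []
        for other in unassigned:
            # Assign sufficiently similar base clusters to this hyper cluster.
            if similarity(template, other) >= threshold:
                members.append(other)
            else:
                remaining.append(other)
        unassigned = remaining
        hyper_clusters.append(members)
    return hyper_clusters


# Toy similarity: labels in the same hand-assigned group are similar.
groups = {
    "medical baseline": "medical",
    "medical condition": "medical",
    "payment plan": "payment",
    "payment arrangement": "payment",
}
toy_similarity = lambda a, b: 0.9 if groups[a] == groups[b] else 0.1
result = hyper_cluster(list(groups), toy_similarity)
```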

In operation 450, each of the hyper clusters can then be labeled in a manner similar to that described above with respect to labeling a single base cluster in operation 330 (see, e.g., FIG. 9C), but by selecting the best noun phrase from all the noun phrases of all of the base clusters of the hyper clusters rather than the noun phrases of a single base cluster (as would be the case when labeling a single base cluster).

The resulting set of hyper clusters of base clusters can then be transformed by the GUI module 45 d of the analytics server 45 and output for display on a user terminal such as an agent device 38 or other general purpose computing device. FIG. 6A illustrates the display of multiple hyper clusters (and possibly a number of base clusters) according to one embodiment of the present invention and FIGS. 6B and 6C respectively show a “zoomed in” view of a single hyper cluster (e.g., the hyper clusters “payment” and “access issues” respectively), where FIGS. 6B and 6C illustrate base clusters grouped within their parent hyper clusters. According to one embodiment, further zooming in on any of the base clusters would show the underlying fragments contained within those base clusters.

The various hyper clusters and base clusters shown in FIGS. 6A, 6B, and 6C are represented as circles of various sizes and spaced apart at various distances, where the relative sizes signify the relative frequency with which fragments contained within those clusters or hyper clusters appeared in the data set of interest, and wherein the distances between the circles represent their semantic distances from one another, so that frequently discussed topics appear larger in the display and related concepts appear close to one another.

According to some embodiments of the present invention, the progress of clusters can be tracked over time to help detect changes in the data. For example, embodiments of the present invention are directed to methods of detecting events that are emerging (e.g., events that did not occur until a particular period in time and then suddenly start occurring) and detecting events that stop happening.

According to one embodiment of the present invention, changes in clusters (“movers and shakers”) between two different periods of time (e.g., Period A and Period B) can be detected using substantially the same process of base clustering as described above with respect to operation 300, but performed on all the interactions contained in Period A along with all the interactions contained in Period B while keeping track of which period (Period A or Period B) the fragments contained within the clusters came from. After clustering the two periods combined, the number of fragments originating from each period (or both periods) is extracted, thereby allowing the detection of new topics and the disappearance of other topics. For example, if Period B is later in time than Period A, and all of the fragments containing the words “credit card overcharge” appear in Period B, then the system would detect that a new topic related to “credit card overcharges” appeared in Period B.
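The period bookkeeping can be sketched as follows, assuming each fragment in a cluster is stored together with the period it came from. The data layout, the function names, and the three category names are hypothetical.

```python
from collections import Counter


def fragments_per_period(clusters):
    """Count, for each cluster label, how many fragments came from each period.

    `clusters` maps a label to a list of (fragment, period) pairs.
    """
    return {
        label: Counter(period for _, period in fragments)
        for label, fragments in clusters.items()
    }


def classify_topic(counts, earlier="A", later="B"):
    """Classify a topic as new, disappeared, or ongoing between two periods."""
    if counts[earlier] == 0 and counts[later] > 0:
        return "new"
    if counts[earlier] > 0 and counts[later] == 0:
        return "disappeared"
    return "ongoing"


clusters = {
    "credit card overcharge": [("a credit card overcharge", "B"),
                               ("got a credit card overcharge", "B")],
    "payment plan": [("set up a payment plan", "A"),
                     ("have a payment plan", "B")],
}
counts = fragments_per_period(clusters)
status = {label: classify_topic(c) for label, c in counts.items()}
```

In this toy input every "credit card overcharge" fragment comes from Period B, so the topic is flagged as new.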

One aspect of embodiments of the present invention is directed to tracking particular topics or categories over time. As discussed above, the labeling of base clusters in operation 330 is performed automatically based on the fragments contained in the clusters and possibly based on the randomly chosen template fragments. Therefore, the processes described above may generate base clusters and hyper clusters whose names may change over time, though they represent substantially the same topics.

According to one embodiment, to track topics over time while accounting for this potential drift in cluster labels, an initial set of data for clustered periods is generated. For example, in one embodiment, tracking begins after having completed the clustering process for ten days. Each of these periods (e.g., each of these ten days) includes a set of hyper clusters, where each hyper cluster contains a set of base clusters. Both the hyper clusters and the base clusters, however, can be represented by a distribution over words. For example, the “payment arrangement” hyper cluster may contain base clusters “payment arrangement” and “payment plan.” Due to the presence of other fragments that may also exist in these base clusters, the word distribution in these clusters may look like Table 3 below:

TABLE 3
Word           Probability
Payment        0.4
Plan           0.1
Arrangement    0.1
Get            0.1
Have           0.1
Cancel         0.1
Arrangement    0.1

According to one embodiment of the present invention, a word distribution can be represented by a separate noun and verb distribution, and more weight is given to the noun than to the verb.

Once word distributions are computed, in order to track hyper clusters over time, embodiments of the present invention can compare whether two clusters are the same by determining whether their divergence (for example, Kullback-Leibler divergence) meets a particular threshold requirement.
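A minimal sketch of such a comparison is shown below. Because the Kullback-Leibler divergence is undefined when one distribution assigns zero probability to a word the other uses, the sketch adds a small smoothing constant over the shared vocabulary; the smoothing, the threshold value, and the function names are assumptions and are not taken from the description above.

```python
import math


def kl_divergence(p, q, eps=1e-6):
    """Kullback-Leibler divergence D(p || q) between two word distributions.

    `p` and `q` map words to probabilities. Additive smoothing with `eps`
    (an assumption) keeps the divergence finite for unseen words.
    """
    vocab = set(p) | set(q)

    def smooth(dist):
        raw = {w: dist.get(w, 0.0) + eps for w in vocab}
        total = sum(raw.values())
        return {w: v / total for w, v in raw.items()}

    ps, qs = smooth(p), smooth(q)
    return sum(ps[w] * math.log(ps[w] / qs[w]) for w in vocab)


def same_topic(p, q, max_divergence=0.5):
    """Treat two clusters as the same topic if their divergence is small."""
    return kl_divergence(p, q) <= max_divergence


day_1 = {"payment": 0.4, "plan": 0.3, "arrangement": 0.3}
day_2 = {"payment": 0.5, "plan": 0.25, "arrangement": 0.25}
other = {"medical": 0.5, "baseline": 0.5}
```

Since the divergence is not symmetric, an implementation might compare in both directions or use a symmetric variant; that choice is left open here.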

Embodiments of the invention can be practiced as methods or systems. Computer devices or systems including, for example, a microprocessor, memory, a network communications device, and a mass storage device can be used to execute the processes described above in an automated or semi-automated fashion. In other words, the above processes can be coded as computer executable code and processed by the computer device or system.

It should also be appreciated from the above that various structures and functions described herein may be incorporated into a variety of apparatus. In some embodiments, hardware components such as processors, controllers, and/or logic may be used to implement the described components or circuits. In some embodiments, code such as software or firmware executing on one or more processing devices may be used to implement one or more of the described operations or components.

As would be understood by one of ordinary skill in the art, the processes described herein and as illustrated in the flowcharts in the figures may be implemented by instructions stored in computer memory to control a computer processor to perform the described functions. In addition, steps and operations shown in the flowcharts do not need to be executed in the order shown, and a person of ordinary skill in the art at the time the invention was made would understand that the order of the steps and operations performed may vary without deviating from or substantially altering the underlying technique.

While the present invention has been described in connection with certain exemplary embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, but, on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims, and equivalents thereof.

What is claimed is:
 1. A method for automatically detecting and categorizing topics in a plurality of interactions between customers and agents of a contact center during one or more time periods, the interactions comprising a plurality of phrases, the method comprising: extracting a plurality of fragments from the plurality of interactions in accordance with one or more extraction rules by a processor of an analytics system configured to automatically detect and categorize topics in the plurality of interactions, at least one of the extraction rules comprising a part of speech sequence, the extraction rules being automatically generated based on a set of key fragments; filtering, by the processor, the plurality of fragments to generate a filtered plurality of fragments; clustering, by the processor, the filtered fragments into a plurality of base clusters; clustering, by the processor, the plurality of base clusters into a plurality of hyper clusters; and outputting, by the processor, a hierarchy of concepts in accordance with the filtered fragments clustered into the base clusters and the base clusters clustered into the hyper clusters, the base clusters corresponding to topics detected in the interactions occurring during the one or more time periods, and the hyper clusters corresponding to categorizations of the topics.
 2. The method of claim 1, wherein the extracting the plurality of fragments from the plurality of interactions comprises: receiving, by the processor, text corresponding to the plurality of interactions; tagging, by the processor, portions of the text based on parts of speech; and extracting, by the processor, fragments from the text in accordance with the one or more extraction rules.
 3. The method of claim 2, wherein the interactions comprise speech between customers and agents of a contact center, and wherein the text corresponding to the plurality of interactions comprises an output of an automatic speech recognition engine, the output being generated by processing speech of at least one of the customers and speech of at least one of the agents from at least one of the plurality of interactions through the automatic speech recognition engine.
 4. The method of claim 1, further comprising labeling, by the processor, a base cluster of the plurality of base clusters, the labeling comprising: extracting, by the processor, a plurality of noun phrases from the base cluster; computing, by the processor, a distribution of probabilities of stems of the noun phrases; and identifying, by the processor, a label noun phrase of the noun phrases, the label noun phrase having a highest probability based on the stem distribution.
 5. The method of claim 1, wherein the clustering the plurality of base clusters into the plurality of hyper clusters comprises: computing, by the processor, a plurality of semantic distances between pairs of the plurality of base clusters; and clustering, by the processor, the base clusters into the hyper clusters in accordance with the semantic distances.
 6. The method of claim 5, wherein the plurality of semantic distances are computed based on semantic similarities of the pairs of base clusters and co-occurrence of fragments in the pairs of base clusters.
 7. The method of claim 1, further comprising: generating, by the processor, a visualization of the plurality of topics as organized into a hierarchy based on the plurality of hyper clusters, at least one of the hyper clusters comprising a plurality of corresponding base clusters, each of the base clusters comprising a corresponding plurality of fragments.
 8. An analytics system of a contact center, the analytics system being configured to automatically detect and categorize topics in a plurality of interactions between customers and agents of a contact center during one or more time periods, the interactions comprising a plurality of phrases, the system comprising: a processor; and a memory, wherein the memory has stored thereon instructions that, when executed by the processor, cause the processor to: receive a plurality of interactions between customers and agents of the contact center; extract a plurality of fragments from the plurality of interactions in accordance with one or more extraction rules, at least one of the extraction rules comprising a part of speech sequence, the extraction rules being automatically generated based on a set of key fragments; filter the plurality of fragments to generate a filtered plurality of fragments; cluster the filtered fragments into a plurality of base clusters; cluster the plurality of base clusters into a plurality of hyper clusters; and output a hierarchy of concepts in accordance with the filtered fragments clustered into the base clusters and the base clusters clustered into the hyper clusters, the base clusters corresponding to topics detected in the interactions occurring during the one or more time periods, and the hyper clusters corresponding to categorizations of the topics.
 9. The system of claim 8, wherein the instructions that cause the processor to extract the plurality of fragments from the plurality of interactions comprise instructions that, when executed by the processor, cause the processor to: receive text corresponding to the plurality of interactions; tag portions of the text based on parts of speech; and extract fragments from the text in accordance with the one or more extraction rules.
 10. The system of claim 9, wherein the interactions comprise speech between customers and agents of a contact center, and wherein the text corresponding to the plurality of interactions comprises an output of an automatic speech recognition engine, the output being generated by processing speech of at least one of the customers and speech of at least one of the agents from at least one of the plurality of interactions through the automatic speech recognition engine.
 11. The system of claim 8, wherein the memory further has stored thereon instructions that, when executed by the processor, cause the processor to label a base cluster of the plurality of base clusters by: extracting a plurality of noun phrases from the base cluster; computing a distribution of probabilities of stems of the noun phrases; and identifying a label noun phrase of the noun phrases, the label noun phrase having a highest probability based on the stem distribution.
 12. The system of claim 8, wherein the instructions that cause the processor to cluster the plurality of base clusters into the plurality of hyper clusters comprise instructions that, when executed by the processor, cause the processor to: compute a plurality of semantic distances between pairs of the plurality of base clusters; and cluster the base clusters into the hyper clusters in accordance with the semantic distances.
 13. The system of claim 12, wherein the instructions that cause the processor to compute the plurality of semantic distances between the pairs of the base clusters comprise instructions to compute a semantic distance of the semantic distances based on semantic similarities between the pairs of the base clusters and co-occurrence of fragments in the pairs of the base clusters.
 14. The system of claim 8, wherein the memory further has stored thereon instructions that, when executed by the processor, cause the processor to generate a visualization of a plurality of topics as organized into a hierarchy based on the plurality of hyper clusters, at least one of the hyper clusters comprising a plurality of corresponding base clusters, each of the base clusters comprising a corresponding plurality of fragments.
 15. The method of claim 1, wherein the one or more extraction rules are generated by: applying sequential pattern mining to one or more key fragments.
 16. A method for automatically detecting and categorizing topics in a plurality of interactions between customers and agents of a contact center during one or more time periods, the interactions comprising a plurality of phrases, the method comprising: extracting a plurality of fragments from the plurality of interactions in accordance with one or more extraction rules by a processor of an analytics system configured to automatically detect and categorize topics in the plurality of interactions, at least one of the extraction rules comprising a part of speech sequence comprising a verb; filtering, by the processor, the plurality of fragments to generate a filtered plurality of fragments; clustering, by the processor, the filtered fragments into a plurality of base clusters; clustering, by the processor, the plurality of base clusters into a plurality of hyper clusters; and outputting, by the processor, a hierarchy of concepts in accordance with the filtered fragments clustered into the base clusters and the base clusters clustered into the hyper clusters, the base clusters corresponding to topics detected in the interactions occurring during the one or more time periods, and the hyper clusters corresponding to categorizations of the topics, wherein the one or more extraction rules are generated by: generating an initial set of extraction rules; computing a precision and a recall of the set of extraction rules on a set of key fragments; comparing the computed precision and recall against one or more threshold values; in response to determining that the computed precision and recall fail to satisfy the one or more threshold values: identifying one or more sequences of parts of speech that are missing from the set of extraction rules and present in the key fragments; and adding the identified one or more sequences of parts of speech to the set of extraction rules; and in response to determining that the computed precision and recall satisfy the one or more threshold values, outputting the set of extraction rules.